SEQUENCE LISTING

(1) GENERAL INFORMATION: 

(i) APPLICANT: HOPWOOD, JOHN JOSEPH; SCOTT, HAMISH STEELE;
WEBER, BIRGIT; BLANCH, LIANNE; ANSON, DONALD STEWART

(ii) TITLE OF INVENTION: SYNTHETIC MAMMALIAN
α-N-ACETYLGLUCOSAMINIDASE AND GENETIC SEQUENCES ENCODING SAME



(iv) CORRESPONDENCE ADDRESS:

(A) ADDRESSEE: NIXON PEABODY LLP 

(B) STREET: 990 STEWART AVENUE 

(C) CITY: GARDEN CITY 

(D) STATE: NEW YORK 

(E) COUNTRY: UNITED STATES 

(F) ZIP: 11530 

(v) COMPUTER READABLE FORM: 

(A) MEDIUM TYPE: Floppy disk 

(B) COMPUTER: IBM PC compatible 

(C) OPERATING SYSTEM: PC-DOS/MS-DOS

(D) SOFTWARE: PatentIn Release #1.0, Version #1.25

(vi) CURRENT APPLICATION DATA: 

(A) APPLICATION NUMBER: 09/077,354 

(B) FILING DATE: 22-APRIL-1999

(vii) PRIOR APPLICATION DATA: 

(A) APPLICATION NUMBER: PCT/US96/00747

(B) FILING DATE: 22-NOV-1996

(viii) ATTORNEY/AGENT INFORMATION: 

(A) NAME: POKALSKY, ANN R. 

(B) REGISTRATION NUMBER: 34,697 

(C) REFERENCE/DOCKET NUMBER: 2249/104

(ix) TELECOMMUNICATION INFORMATION: 

(A) TELEPHONE: 516 742 4343 

(B) TELEFAX: 516 742 4366 



(2) INFORMATION FOR SEQ ID NO: 1:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 2575 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear



(iii) NUMBER OF SEQUENCES:

(ii) MOLECULE TYPE: cDNA 
(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens 

(F) TISSUE TYPE: Peripheral Blood 

(G) CELL TYPE: Leukocyte 

(ix) FEATURE: 

(A) NAME/KEY: CDS

(B) LOCATION: 102..2330
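
The CDS range above can be cross-checked with simple arithmetic. The following is a minimal Python sketch, not part of the filed listing; the helper name `cds_codon_count` is hypothetical. It treats the location as an inclusive 1-based range and assumes the stop codon lies outside it, which fits the 743 residues given for SEQ ID NO: 2.

```python
def cds_codon_count(start: int, end: int) -> int:
    """Return the number of whole codons in an inclusive 1-based range."""
    length = end - start + 1  # both endpoints are part of the feature
    if length % 3 != 0:
        raise ValueError(f"range {start}..{end} is not a whole number of codons")
    return length // 3


# CDS location taken from the feature table of SEQ ID NO: 1
codons = cds_codon_count(102, 2330)
print(codons)  # 743
```

A count that did not divide evenly by three would suggest a mis-transcribed endpoint.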

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 1:
CCCGGGCTTA GCCTTCGGGT CCACGTGGCC GGAGGCCGGC AGCTGATTGG ACGCGGGCCG 60

CCCCACCCCC TGGCCGTCGC GGGACCCGCA GGACTGAGAC C ATG GAG GCG GTG 113
Met Glu Ala Val
1

GCG GTG GCC GCG GCG GTG GGG GTC CTT CTC CTG GCC GGG GCC GGG GGC 161
Ala Val Ala Ala Ala Val Gly Val Leu Leu Leu Ala Gly Ala Gly Gly
5 10 15 20

GCG GCA GGC GAC GAG GCC CGG GAG GCG GCG GCC GTG CGG GCG CTC GTG 209
Ala Ala Gly Asp Glu Ala Arg Glu Ala Ala Ala Val Arg Ala Leu Val
25 30 35

GCC CGG CTG CTG GGG CCA GGC CCC GCG GCC GAC TTC TCC GTG TCG GTG 257
Ala Arg Leu Leu Gly Pro Gly Pro Ala Ala Asp Phe Ser Val Ser Val
40 45 50

GAG CGC GCT CTG GCT GCC AAG CCG GGC TTG GAC ACC TAC AGC CTG GGC 305
Glu Arg Ala Leu Ala Ala Lys Pro Gly Leu Asp Thr Tyr Ser Leu Gly
55 60 65

GGC GGC GGC GCG GCG CGC GTG CGG GTG CGC GGC TCC ACG GGC GTG GCG 353
Gly Gly Gly Ala Ala Arg Val Arg Val Arg Gly Ser Thr Gly Val Ala
70 75 80

GCC GCC GCG GGG CTG CAC CGC TAC CTG CGC GAC TTC TGT GGC TGC CAC 401
Ala Ala Ala Gly Leu His Arg Tyr Leu Arg Asp Phe Cys Gly Cys His
85 90 95 100

GTG GCC TGG TCC GGC TCT CAG CTG CGC CTG CCG CGG CCA CTG CCA GCC 449
Val Ala Trp Ser Gly Ser Gln Leu Arg Leu Pro Arg Pro Leu Pro Ala
105 110 115

GTG CCG GGG GAG CTG ACC GAG GCC ACG CCC AAC AGG TAC CGC TAT TAC 497
Val Pro Gly Glu Leu Thr Glu Ala Thr Pro Asn Arg Tyr Arg Tyr Tyr
120 125 130

CAG AAT GTG TGC ACG CAA AGC TAC TCC TTC GTG TGG TGG GAC TGG GCC 545
Gln Asn Val Cys Thr Gln Ser Tyr Ser Phe Val Trp Trp Asp Trp Ala
135 140 145

CGC TGG GAG CGA GAG ATA GAC TGG ATG GCG CTG AAT GGC ATC AAC CTG 593
Arg Trp Glu Arg Glu Ile Asp Trp Met Ala Leu Asn Gly Ile Asn Leu
150 155 160
GCA CTG GCC TGG AGC GGC CAG GAG GCC ATC TGG CAG CGG GTG TAC CTG 641
Ala Leu Ala Trp Ser Gly Gln Glu Ala Ile Trp Gln Arg Val Tyr Leu
165 170 175 180

GCC TTG GGC CTG ACC CAG GCA GAG ATC AAT GAG TTC TTT ACT GGT CCT 689
Ala Leu Gly Leu Thr Gln Ala Glu Ile Asn Glu Phe Phe Thr Gly Pro
185 190 195

GCC TTC CTG GCC TGG GGG CGA ATG GGC AAC CTG CAC ACC TGG GAT GGC 737
Ala Phe Leu Ala Trp Gly Arg Met Gly Asn Leu His Thr Trp Asp Gly
200 205 210

CCC CTG CCC CCC TCC TGG CAC ATC AAG CAG CTT TAC CTG CAG CAC CGG 785
Pro Leu Pro Pro Ser Trp His Ile Lys Gln Leu Tyr Leu Gln His Arg
215 220 225

GTC CTG GAC CAG ATG CGC TCC TTC GGC ATG ACC CCA GTG CTG CCT GCA 833
Val Leu Asp Gln Met Arg Ser Phe Gly Met Thr Pro Val Leu Pro Ala
230 235 240

TTC GCG GGG CAT GTT CCC GAG GCT GTC ACC AGG GTG TTC CCT CAG GTC 881
Phe Ala Gly His Val Pro Glu Ala Val Thr Arg Val Phe Pro Gln Val
245 250 255 260

AAT GTC ACG AAG ATG GGC AGT TGG GGC CAC TTT AAC TGT TCC TAC TCC 929
Asn Val Thr Lys Met Gly Ser Trp Gly His Phe Asn Cys Ser Tyr Ser
265 270 275

TGC TCC TTC CTT CTG GCT CCG GAA GAC CCC ATA TTC CCC ATC ATC GGG 977
Cys Ser Phe Leu Leu Ala Pro Glu Asp Pro Ile Phe Pro Ile Ile Gly
280 285 290

AGC CTC TTC CTG CGA GAG CTG ATC AAA GAG TTT GGC ACA GAC CAC ATC 1025
Ser Leu Phe Leu Arg Glu Leu Ile Lys Glu Phe Gly Thr Asp His Ile
295 300 305

TAT GGG GCC GAC ACT TTC AAT GAG ATG CAG CCA CCT TCC TCA GAG CCC 1073
Tyr Gly Ala Asp Thr Phe Asn Glu Met Gln Pro Pro Ser Ser Glu Pro
310 315 320

TCC TAC CTT GCC GCA GCC ACC ACT GCC GTC TAT GAG GCC ATG ACT GCA 1121
Ser Tyr Leu Ala Ala Ala Thr Thr Ala Val Tyr Glu Ala Met Thr Ala
325 330 335 340

GTG GAT ACT GAG GCT GTG TGG CTG CTC CAA GGC TGG CTC TTC CAG CAC 1169
Val Asp Thr Glu Ala Val Trp Leu Leu Gln Gly Trp Leu Phe Gln His
345 350 355

CAG CCG CAG TTC TGG GGG CCC GCC CAG ATC AGG GCT GTG CTG GGA GCT 1217
Gln Pro Gln Phe Trp Gly Pro Ala Gln Ile Arg Ala Val Leu Gly Ala
360 365 370

GTG CCC CGT GGC CGC CTC CTG GTT CTG GAC CTG TTT GCT GAG AGC CAG 1265
Val Pro Arg Gly Arg Leu Leu Val Leu Asp Leu Phe Ala Glu Ser Gln
375 380 385
CCT GTG TAT ACC CGC ACT GCC TCC TTC CAG GGC CAG CCC TTC ATC TGG 1313
Pro Val Tyr Thr Arg Thr Ala Ser Phe Gln Gly Gln Pro Phe Ile Trp
390 395 400

TGC ATG CTG CAC AAC TTT GGG GGA AAC CAT GGT CTT TTT GGA GCC CTA 1361
Cys Met Leu His Asn Phe Gly Gly Asn His Gly Leu Phe Gly Ala Leu
405 410 415 420

GAG GCT GTG AAC GGA GGC CCA GAA GCT GCC CGC CTC TTC CCC AAC TCC 1409
Glu Ala Val Asn Gly Gly Pro Glu Ala Ala Arg Leu Phe Pro Asn Ser
425 430 435

ACC ATG GTA GGC ACG GGC ATG GCC CCC GAG GGC ATC AGC CAG AAC GAA 1457
Thr Met Val Gly Thr Gly Met Ala Pro Glu Gly Ile Ser Gln Asn Glu
440 445 450

GTG GTC TAT TCC CTC ATG GCT GAG CTG GGC TGG CGA AAG GAC CCA GTG 1505
Val Val Tyr Ser Leu Met Ala Glu Leu Gly Trp Arg Lys Asp Pro Val
455 460 465

CCA GAT TTG GCA GCC TGG GTG ACC AGC TTT GCC GCC CGG CGG TAT GGG 1553
Pro Asp Leu Ala Ala Trp Val Thr Ser Phe Ala Ala Arg Arg Tyr Gly
470 475 480

GTC TCC CAC CCG GAC GCA GGG GCA GCG TGG AGG CTA CTG CTC CGG AGT 1601
Val Ser His Pro Asp Ala Gly Ala Ala Trp Arg Leu Leu Leu Arg Ser
485 490 495 500

GTG TAC AAC TGC TCC GGG GAG GCC TGC AGG GGC CAC AAT CGT AGC CCG 1649
Val Tyr Asn Cys Ser Gly Glu Ala Cys Arg Gly His Asn Arg Ser Pro
505 510 515

CTG GTC AGG CGG CCG TCC CTA CAG ATG AAT ACC AGC ATC TGG TAC AAC 1697
Leu Val Arg Arg Pro Ser Leu Gln Met Asn Thr Ser Ile Trp Tyr Asn
520 525 530

CGA TCT GAT GTG TTT GAG GCC TGG CGG CTG CTG CTC ACA TCT GCT CCC 1745
Arg Ser Asp Val Phe Glu Ala Trp Arg Leu Leu Leu Thr Ser Ala Pro
535 540 545

TCC CTG GCC ACC AGC CCC GCC TTC CGC TAC GAC CTG CTG GAC CTC ACT 1793
Ser Leu Ala Thr Ser Pro Ala Phe Arg Tyr Asp Leu Leu Asp Leu Thr
550 555 560

CGG CAG GCA GTG CAG GAG CTG GTC AGC TTG TAC TAT GAG GAG GCA AGA 1841
Arg Gln Ala Val Gln Glu Leu Val Ser Leu Tyr Tyr Glu Glu Ala Arg
565 570 575 580

AGC GCC TAC CTG AGC AAG GAG CTG GCC TCC CTG TTG AGG GCT GGA GGC 1889
Ser Ala Tyr Leu Ser Lys Glu Leu Ala Ser Leu Leu Arg Ala Gly Gly
585 590 595

GTC CTG GCC TAT GAG CTG CTG CCG GCA CTG GAC GAG GTG CTG GCT AGT 1937
Val Leu Ala Tyr Glu Leu Leu Pro Ala Leu Asp Glu Val Leu Ala Ser
600 605 610

GAC AGC CGC TTC TTG CTG GGC AGC TGG CTA GAG CAG GCC CGA GCA GCG 1985
Asp Ser Arg Phe Leu Leu Gly Ser Trp Leu Glu Gln Ala Arg Ala Ala
615 620 625

GCA GTC AGT GAG GCC GAG GCC GAT TTC TAC GAG CAG AAC AGC CGC TAC 2033
Ala Val Ser Glu Ala Glu Ala Asp Phe Tyr Glu Gln Asn Ser Arg Tyr
630 635 640

CAG CTG ACC TTG TGG GGG CCA GAA GGC AAC ATC CTG GAC TAT GCC AAC 2081
Gln Leu Thr Leu Trp Gly Pro Glu Gly Asn Ile Leu Asp Tyr Ala Asn
645 650 655 660

AAG CAG CTG GCG GGG TTG GTG GCC AAC TAC TAC ACC CCT CGC TGG CGG 2129
Lys Gln Leu Ala Gly Leu Val Ala Asn Tyr Tyr Thr Pro Arg Trp Arg
665 670 675

CTT TTC CTG GAG GCG CTG GTT GAC AGT GTG GCC CAG GGC ATC CCT TTC 2177
Leu Phe Leu Glu Ala Leu Val Asp Ser Val Ala Gln Gly Ile Pro Phe
680 685 690

CAA CAG CAC CAG TTT GAC AAA AAT GTC TTC CAA CTG GAG CAG GCC TTC 2225
Gln Gln His Gln Phe Asp Lys Asn Val Phe Gln Leu Glu Gln Ala Phe
695 700 705

GTT CTC AGC AAG CAG AGG TAC CCC AGC CAG CCG CGA GGA GAC ACT GTG 2273
Val Leu Ser Lys Gln Arg Tyr Pro Ser Gln Pro Arg Gly Asp Thr Val
710 715 720
GAC CTG GCC AAG AAG ATC TTC CTC AAA TAT TAC CCC GGC TGG GTG GCC 2321
Asp Leu Ala Lys Lys Ile Phe Leu Lys Tyr Tyr Pro Gly Trp Val Ala
725 730 735 740

GGC TCT TGG TGATAGATTC GCCACCACTG GGCCTTGTTT TCCGCTAATT 2370
Gly Ser Trp

CCAGGGCAGA TTCCAGGGCC CAGAGCTGGA CAGACATCAC AGGATAACCC AGGCCTGGGA 2430

GGAGGCCCCA CGGCCTGCTG GTGGGGTCTG ACCTGGGGGG ATTGGAGGGA AATGACCTGC 2490

CCTCCACCAC CACCCAAAGT GTGGGATTAA AGTACTGTTT TCTTTCCACT TAAAAAAAAA 2550

AAAAAAGTCG AGCGGCCGCG AATTC 2575



(2) INFORMATION FOR SEQ ID NO: 2:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 743 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: protein

(ix) FEATURE:

(A) NAME/KEY: Potentially-glycosylated Asn site

(B) LOCATION: 261

(ix) FEATURE:

(A) NAME/KEY: Potentially-glycosylated Asn site

(B) LOCATION: 272

(ix) FEATURE:

(A) NAME/KEY: Potentially-glycosylated Asn site

(B) LOCATION: 435

(ix) FEATURE:

(A) NAME/KEY: Potentially-glycosylated Asn site

(B) LOCATION: 503

(ix) FEATURE:

(A) NAME/KEY: Potentially-glycosylated Asn site

(B) LOCATION: 513

(ix) FEATURE:

(A) NAME/KEY: Potentially-glycosylated Asn site

(B) LOCATION: 526

(ix) FEATURE:

(A) NAME/KEY: Potentially-glycosylated Asn site

(B) LOCATION: 532
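
The Asn positions listed above can be located mechanically. The sketch below is illustrative only and is not from the listing: it assumes the usual N-glycosylation consensus Asn-X-Ser/Thr with X not Pro, which the listing itself does not state, and the function name `find_sequons` is hypothetical. It scans a one-letter-code fragment covering residues 257 to 275 of SEQ ID NO: 2 (Phe Pro Gln Val Asn Val Thr Lys Met Gly Ser Trp Gly His Phe Asn Cys Ser Tyr).

```python
import re


def find_sequons(protein: str, offset: int = 1) -> list[int]:
    """Return 1-based positions of Asn in N-X-S/T motifs (X != Pro).

    `offset` is the residue number of the first character in `protein`.
    A lookahead is used so that overlapping motifs are all reported.
    """
    return [m.start() + offset for m in re.finditer(r"N(?=[^P][ST])", protein)]


# Residues 257-275 of SEQ ID NO: 2, converted to one-letter code
fragment = "FPQVNVTKMGSWGHFNCSY"
sites = find_sequons(fragment, offset=257)
print(sites)  # [261, 272]
```

Running the same scan over the full 743-residue sequence would be the natural way to compare against all seven annotated locations.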



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 2:
Met Glu Ala Val Ala Val Ala Ala Ala Val Gly Val Leu Leu Leu Ala
1 5 10 15

Gly Ala Gly Gly Ala Ala Gly Asp Glu Ala Arg Glu Ala Ala Ala Val
20 25 30

Arg Ala Leu Val Ala Arg Leu Leu Gly Pro Gly Pro Ala Ala Asp Phe
35 40 45

Ser Val Ser Val Glu Arg Ala Leu Ala Ala Lys Pro Gly Leu Asp Thr
50 55 60

Tyr Ser Leu Gly Gly Gly Gly Ala Ala Arg Val Arg Val Arg Gly Ser
65 70 75 80

Thr Gly Val Ala Ala Ala Ala Gly Leu His Arg Tyr Leu Arg Asp Phe
85 90 95

Cys Gly Cys His Val Ala Trp Ser Gly Ser Gln Leu Arg Leu Pro Arg
100 105 110

Pro Leu Pro Ala Val Pro Gly Glu Leu Thr Glu Ala Thr Pro Asn Arg
115 120 125

Tyr Arg Tyr Tyr Gln Asn Val Cys Thr Gln Ser Tyr Ser Phe Val Trp
130 135 140

Trp Asp Trp Ala Arg Trp Glu Arg Glu Ile Asp Trp Met Ala Leu Asn
145 150 155 160

Gly Ile Asn Leu Ala Leu Ala Trp Ser Gly Gln Glu Ala Ile Trp Gln
165 170 175

Arg Val Tyr Leu Ala Leu Gly Leu Thr Gln Ala Glu Ile Asn Glu Phe
180 185 190

Phe Thr Gly Pro Ala Phe Leu Ala Trp Gly Arg Met Gly Asn Leu His
195 200 205

Thr Trp Asp Gly Pro Leu Pro Pro Ser Trp His Ile Lys Gln Leu Tyr
210 215 220

Leu Gln His Arg Val Leu Asp Gln Met Arg Ser Phe Gly Met Thr Pro
225 230 235 240

Val Leu Pro Ala Phe Ala Gly His Val Pro Glu Ala Val Thr Arg Val
245 250 255

Phe Pro Gln Val Asn Val Thr Lys Met Gly Ser Trp Gly His Phe Asn
260 265 270

Cys Ser Tyr Ser Cys Ser Phe Leu Leu Ala Pro Glu Asp Pro Ile Phe
275 280 285

Pro Ile Ile Gly Ser Leu Phe Leu Arg Glu Leu Ile Lys Glu Phe Gly
290 295 300

Thr Asp His Ile Tyr Gly Ala Asp Thr Phe Asn Glu Met Gln Pro Pro
305 310 315 320

Ser Ser Glu Pro Ser Tyr Leu Ala Ala Ala Thr Thr Ala Val Tyr Glu
325 330 335

Ala Met Thr Ala Val Asp Thr Glu Ala Val Trp Leu Leu Gln Gly Trp
340 345 350

Leu Phe Gln His Gln Pro Gln Phe Trp Gly Pro Ala Gln Ile Arg Ala
355 360 365

Val Leu Gly Ala Val Pro Arg Gly Arg Leu Leu Val Leu Asp Leu Phe
370 375 380

Ala Glu Ser Gln Pro Val Tyr Thr Arg Thr Ala Ser Phe Gln Gly Gln
385 390 395 400

Pro Phe Ile Trp Cys Met Leu His Asn Phe Gly Gly Asn His Gly Leu
405 410 415

Phe Gly Ala Leu Glu Ala Val Asn Gly Gly Pro Glu Ala Ala Arg Leu
420 425 430

Phe Pro Asn Ser Thr Met Val Gly Thr Gly Met Ala Pro Glu Gly Ile
435 440 445

Ser Gln Asn Glu Val Val Tyr Ser Leu Met Ala Glu Leu Gly Trp Arg
450 455 460

Lys Asp Pro Val Pro Asp Leu Ala Ala Trp Val Thr Ser Phe Ala Ala
465 470 475 480

Arg Arg Tyr Gly Val Ser His Pro Asp Ala Gly Ala Ala Trp Arg Leu
485 490 495

Leu Leu Arg Ser Val Tyr Asn Cys Ser Gly Glu Ala Cys Arg Gly His
500 505 510

Asn Arg Ser Pro Leu Val Arg Arg Pro Ser Leu Gln Met Asn Thr Ser
515 520 525

Ile Trp Tyr Asn Arg Ser Asp Val Phe Glu Ala Trp Arg Leu Leu Leu
530 535 540

Thr Ser Ala Pro Ser Leu Ala Thr Ser Pro Ala Phe Arg Tyr Asp Leu
545 550 555 560

Leu Asp Leu Thr Arg Gln Ala Val Gln Glu Leu Val Ser Leu Tyr Tyr
565 570 575

Glu Glu Ala Arg Ser Ala Tyr Leu Ser Lys Glu Leu Ala Ser Leu Leu
580 585 590



Arg Ala Gly Gly Val Leu Ala Tyr Glu Leu Leu Pro Ala Leu Asp Glu
595 600 605

Val Leu Ala Ser Asp Ser Arg Phe Leu Leu Gly Ser Trp Leu Glu Gln
610 615 620

Ala Arg Ala Ala Ala Val Ser Glu Ala Glu Ala Asp Phe Tyr Glu Gln
625 630 635 640

Asn Ser Arg Tyr Gln Leu Thr Leu Trp Gly Pro Glu Gly Asn Ile Leu
645 650 655

Asp Tyr Ala Asn Lys Gln Leu Ala Gly Leu Val Ala Asn Tyr Tyr Thr
660 665 670

Pro Arg Trp Arg Leu Phe Leu Glu Ala Leu Val Asp Ser Val Ala Gln
675 680 685

Gly Ile Pro Phe Gln Gln His Gln Phe Asp Lys Asn Val Phe Gln Leu
690 695 700

Glu Gln Ala Phe Val Leu Ser Lys Gln Arg Tyr Pro Ser Gln Pro Arg
705 710 715 720

Gly Asp Thr Val Asp Leu Ala Lys Lys Ile Phe Leu Lys Tyr Tyr Pro
725 730 735

Gly Trp Val Ala Gly Ser Trp
740



(2) INFORMATION FOR SEQ ID NO: 3:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 10380 base pairs

(B) TYPE: nucleic acid

(C) STRANDEDNESS: single

(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: DNA (genomic)

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(viii) POSITION IN GENOME:

(A) CHROMOSOME/SEGMENT: Chromosome 17

(ix) FEATURE:

(A) NAME/KEY: exon 1

(B) LOCATION: 990..1372

(ix) FEATURE:

(A) NAME/KEY: exon 2

(B) LOCATION: 2115..2262

(ix) FEATURE:

(A) NAME/KEY: exon 3

(B) LOCATION: 3056..3202

(ix) FEATURE:

(A) NAME/KEY: exon 4

(B) LOCATION: 3387..3472

(ix) FEATURE:

(A) NAME/KEY: exon 5

(B) LOCATION: 5667..5923

(ix) FEATURE:

(A) NAME/KEY: exon 6

(B) LOCATION: 7745..8955
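
The six exon ranges above can be summed as a consistency check against the cDNA of SEQ ID NO: 1. This is a minimal Python sketch and not part of the listing; the variable names are hypothetical, and it assumes each location is an inclusive 1-based range. The total of 2232 nucleotides equals 744 codons, which would correspond to the 743-residue protein of SEQ ID NO: 2 plus one stop codon if the annotated ranges cover only coding sequence.

```python
# Exon locations copied from the feature table of SEQ ID NO: 3
exons = {
    "exon 1": (990, 1372),
    "exon 2": (2115, 2262),
    "exon 3": (3056, 3202),
    "exon 4": (3387, 3472),
    "exon 5": (5667, 5923),
    "exon 6": (7745, 8955),
}

# Inclusive ranges: length = end - start + 1
lengths = {name: end - start + 1 for name, (start, end) in exons.items()}
total = sum(lengths.values())

for name, length in lengths.items():
    print(f"{name}: {length} nt")
print(f"total: {total} nt = {total // 3} codons, remainder {total % 3}")
```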

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 3:

ATAATGAGCA GTGAGGACGA TCAGAGGTCA CCTTCCTGTC TTGGTTTTGG CAGGTTTTGA 60

CCAGTTTCTT TGCTGCATTC TGTTTTATCA GCGGGGTCTT GTGACCTTTT ATCTTGTGCT 120

GACCTCCTGT CTCATCCTGT GACGAAGGCC TAACCTCCTG GGAATTCAGC CCAGCAGGTC 180

TCTGCCTCAT TTTACCCAGC CCCTGTTCAA GATGGAGTCG CTCTGGTTGG AAACTTCTGA 240

CAAAATGACA GCTCCTGTTA TGTTGCTGCT GCTGCCGCCA ATGGACAGCC TTTAACGTGC 300

CCGCCAGCCC TGCTCCACCG CCGGCCTGGG CTCACATGGC CCCATCCCTC CTCGAACCTC 360

CTAGCCTGTT AGTTACTCAA ATCTGCAAGC TCTCTGCCTT CTCAGGGCCT TCAATAAATG 420

CATTTCTTCT GTCTGGAAGG CTCTTCCTTT CCCTCTTCTA GCCAATTCCT ATTCATCCCT 480

GAGTTTCAGA TTAAAAGTCA CTTCCTTTGG AAACCTTACT TCGCTACTTC GCTACTTACT 540

GCACTACTTC GCAGCATCAC AACTATGATG GAAATCCTTA CTTACGTTAA ATATCTGGTT 600

TCTAGGTCAC CTCCCTGACG GGGACGGTAG GGACCGTCTT CTCGTTCATC AGTAGGGAAG 660

TAGCTATGGC AGTGCCTGAT ACAAAATAAA CTCCAAATGT GTATTTATTA GATGGTTGGA 720

TGGAAGTTAT TTGCGTGTGA AAGCGCGTTT TACCCGAAGG CGCTCTGTGA GGGCCAGCGG 780

GTCCCCTTCG GCCCTGGAGC CGGGGTCACA CGCTCCCCAC CGCGTGCGGT CACGAGACGC 840

CCCCAAGGGA GTATCCTGGT ACCCGGAAGC CGCGACTCCT GGCCCTGAGC CCGGGCTTAG 900

CCTTCGGGTC CACGTGGCCG GAGCCGGCAG CTGATTGGAC GCGGGCCGCC CCACCCCCTG 960

GCCGTCGCGG GACCCGCAGG ACTGAGACCA TGGAGGCGGT GGCGGTGGCC GCGGCGGTGG 1020

GGGTCCTTCT CCTGGCCGGG GCCGGGGGCG CGGCAGGCGA CGAGGCCCGG GAGGCGGCGG 1080

CCGTGCGGGC GCTCGTGGCC CGGCTGCTGG GGCCAGGCCC CGCGGCCGAC TTCTCCGTGT 1140



CGGTGGAGCG CGCTCTGGCT GCCAAGCCGG GCTTGGACAC CTACAGCCTG GGCGGCGGCG 1200
GCGCGGCGCG CGTGCGGGTG CGCGGCTCCA CGGGCGTGGC GGCCGCCGCG GGGCTGCACC 1260
GCTACCTGCG CGACTTCTGT GGCTGCCACG TGGCCTGGTC CGGCTCTCAG CTGCGCCTGC 1320
CGCGGCCACT GCCAGCCGTG CCGGGGGAGC TGACCGAGGC CACGCCCAAC AGGTACCGCC 1380
CCGAAGCTTC CCCGCGTCCG CCCGAGGCGC TTACCCCCTC CCGGAGCCGC TGCCACCCAA 1440
ATCGGGAGGC TGAGCGGGGA GCGCTGGCCG GAAGGCCCAG CTGCGCCGCC TCCAGCAGCT 1500
GTGTGGCCTT GAGCCAGCCA CTCTGCCTTT CAGAGCCTCG GCTGGCCCAC CTGAAAAACG 1560
GAAAGAAGAC GCCTACCGTG CAGTGTTATT GTGAGGATTT GCACGATGAT GGGCATAGAA 1620
TTTGTGGTGC ACAATTGGTG ATGAGTGAAT TTTCTTGCCT TCCTCCCCCA CCTTCTCTTT 1680
GAACCTGCGG ACTGAGGAAG GACGCCTCCA TCCCCCACCC TACAGGCCTG TGTTCCAGCG 1740
CCTGCCACAC TATGGAGTGA TGTGTTCACA CAGCTGTCCT CCCCTGCCCA TCTGTTAGAC 1800
TGTGGGGGCA GGGATTCCCC GTTCCAGGAA AACACCGTGC AGAGGAGGGG CTCTGGCAGT 1860
GTGGCATGAA AGTGGAATAT GCCACCCAAA TACCCGCCAG GCTAGAGGGC CCTGGGAGAG 1920
TGCAGGGGAC GAGTGCCTCA GAAGCCCAGC CCCGGTACCT GGTCTCAGCT CCACCTGGGG 1980
TGGGTCCCAG TGTGCAGCAG AAGGGCCGAG TTTGGAGCCC CTCCCCTCTC CTCTAGGTGG 2040
GGGATGGGGG ATTTGTTCCA GGGCCGTGGA CCCTCCAGGG TGGGATGCGC CCCTGCTCAT 2100
GACACTGCCC GCAGGTACCG CTATTACCAG AATGTGTGCA CGCAAAGCTA CTCCTTCGTG 2160
TGGTGGGACT GGGCCCGCTG GGAGCGAGAG ATAGACTGGA TGGCGCTGAA TGGCATCAAC 2220
CTGGCACTGG CCTGGAGCGG CCAGGAGGCC ATCTGGCAGC GGGTGCGTGC CCACTGTCCC 2280
TTCCCCACCC TCCTCTATGG CGGGAGCCAC CGTAGGTGTT TTCACCCGCC CCCCAGCATG 2340
GGCGCAGTGT CTCTCTCTAG AAGTGCTTTC AGCGTGCACA GTGGCTTGGG CCTCCTAAAA 2400
ACTGAGGCTT CCGGCCGGGC GCGGTGGCTC ACGCCTGTCA TCCCAGCACT TCGGGAGGCC 2460
TAGGCGGGCG GATCAGGAGT TCAGGAGATC GAGACCATCC TGGCCAACAT TGTGAAACCC 2520
CGTCTCTACT AAAATACAAA GAAATAGCAA CCTGGGCAAC AGAGCGAGAC TCTGTCTAAA 2580
AAAAAAAAAA AAAAAAACTG AGGCTTCCAG TTTGAGGAGT GGGGCTCCTT CCCCCATCTC 2640
CCCTATGCAG CCAATCACCT GGTCCCTTGG ATCCAACTCA TGGGCAGCTC TAGATCTGCC 2700
TCCCTGGAAG CTTCTGTGCT GCAATGGCTG CTCCAGGCTC TGCTTAAGCT CTTCACACAG 2760
TTGCCCTGCC CTTCCATCTG GCACTCTTGC TCCATGAAGC CTTCTAAGGC CTTCCTGTTG 2820
GGGGAAAGCC CCTTTGTGCC CCATCTCCTC ACCCATGCGA CAAAGGCAAC ACAGTGAACT 2880
CACCTACTCA CAGGTCTCTT TCCTCTGGGC TGTGGGCTCC TTGATGGCAG CGTTCGGATT 2940
TTGTCTCAGT AGCCCTAGCA CCCAGCACAA AGAAGCAATG AGTGAATGGT TGTTGAATGA 3000
ATGAATGAAT GAATGAAGAT GAATATATTT CTATGTGTGG GCCCTTCTTC CTCAGGTGTA 3060
CCTGGCCTTG GGCCTGACCC AGGCAGAGAT CAATGAGTTC TTTACTGGTC CTGCCTTCCT 3120
GGCCTGGGGG CGAATGGGCA ACCTGCACAC CTGGGATGGC CCCCTGCCCC CCTCCTGGCA 3180
CATCAAGCAG CTTTACCTGC AGGTAAAAGG ATGGAAAAGG GAAGGGGCAG AATCGGTGAT 3240
AGATGGTCAT GGGCCCAGGA AGGGTGGTAT TAGGCCGGCC CCAGGGCTCT TAACTGAGGC 3300
GGGGGGCTGC GTGTATCCTG GGAGATGAGG GCCTTCTCAT AGGACAGCAG TGGCCATGCT 3360
CACCACCCTT CCTTCTGTTC CTCCAGCACC GGGTCCTGGA CCAGATGCGC TCCTTCGGCA 3420
TGACCCCAGT GCTGCCTGCA TTCGCGGGGC ATGTTCCCGA GGCTGTCACC AGGTGAGGTT 3480
CCGCTCACCC CCTCCACTTA GCTCAGAGAG GGAATTTTAT TCCCTTCTAG AACATGACTT 3540
AAAAACTTAA GCTCTGGGCC GGGCGCAGTG GCTCACGCCT GTAATCCCAG CACTTTGGGA 3600
GGCCGAGTTG GGCGGATCAC CTGAGGTCAG GAGTTCGAGA CCAGCCTGGC CAACATGGTG 3660
AAACCCTGTC TCTACTAAAA ATATAAAAAT TAGCTGGGCA TGGTGGCACG CGCCTGTAAT 3720
CCCATCTACT TAGGAGGCTG AGACAGGAGA ATTGCTTAAA CCTGGGAGGC AGACGTTGCA 3780
GTGAGTCAAG ATCACGCCAT TGCACTCCAG CCTGGGTGAC GAGCGAAACT CTGTCTCAAA 3840
CAAACAAACA AGCTCTGGAC GTAGGCCTGG GTTTGATTTC TGACTCTGCT ACTAATTAGC 3900
TGTGTGACTT CGGGCAGATG ACATGACTGC TCTGTGCCTC AGTTTCCTTA CTTGTAAAAT 3960
GGGATCTCTA CCCACTTCGC TGTAGGGTTT GTAATTATCT CTCGATCTAT CTGTGACTTT 4020
GCACAGAGTG CTAGCAAATG GCAGCCCTTG GGAGTGGCAG CAGGGGTGCT CCAGTGTCCC 4080
TTGTCCCTCC TGTTCCTCTG TGCTTCCCAG CCATCCTCTC ACATGTGGTT GGGAAAAGTC 4140
TTCAAGGCTC ACCTGAGACC TCCCCTCCTT CAGGAAGCCT TGCTAGTGCC CCGCATGACC 4200
TCCTTTGCAC CTGCTAATGT CTGGCTCCCA TACTCTCGTA GGACTTAATG CATGCCAGTG 4260
GCCTCCCTGC CCGCCTCTTT GCCCCCATCA CCAGGTGGCA GGAAACTCAC TCATTCATTC 4320
AATAAACTTG GTCCAGCTGT CTGAGGCTGC CAGAACTGGC TGTGCTGGGT CCTGGGAGGC 4380
GGCAAGAAAG GTGCCCAAGG GCTTACCCCT GATAGGAGAG ATATGTTGGC TGAAGGATAC 4440
AATGTGGGGA CAAGGACAGG AATATATGTG GGTTCCGCTC TCCTCTGCCG GGAGAGAGGG 4500



GCAGGAAGGG CTCAGGGCAG AGCCCAGCCT TGAAAAATGA GTGTTGCTTG GACGGACGCT 4560
TGGCTAATGC TTGTAATCCT AGCGTTTTGG GAGGCTGAGG CGTATGGATC ACCTGCGGTC 4620
AGGAGTTAAA GACCAGCCTG GCCAACATGG CGAAACCCCA TCTCTACTAA AAGTACAAAA 4680
ATTAGCCAGG CGTGGTGGCG GGCTCCTGTA ATCCCAGCTA CTCGGTAGGC TGAGGCATGA 4740
GAATCTCTTG AAGCCAGGGG CCAGAGACTG CAGTGAGCCG AGATCACACC ACTTCACTCC 4800
AGCCTGGGTG ACAGAGTGAG ACTCCGTCTC AAAAAAAAAA AAAAAAAAAG GAAAGAAAAT 4860
TAAACACCTC ATGTTCTCAC TCATAGTGGG AGTTGAACAA TGAGAACAAC ATGGACACAG 4920
GAAGGGGAAC ATCACACACC GGGGCCTTTC GCGGTGTGGG GGTCAAGGGG AGGAGTAGCA 4980
TTGGGACAGA TACTTAATGC ATGCGGGGCT GAAAACCTAG ATGATGGGTT GATGGGTGCA 5040
GCAAACCACC ATGGCACATG TATACCTATG CAACAAACCT GCATGTTCTG CACAGAACTG 5100
AACTGAAAGT ATAATTAAAA AAAAAAAAAA AAGCTGGGTG CGGTGGCCCA CACCTGTAAT 5160
CCCAGCACTT TGGGAGGCCG AGACGGGCGG ATCACAAGGT CAGCAGATCG AGACCATCCT 5220
GGCTAACACA GTGAAACTCA GTCTCTACTA AAAATACAAA AAATTAGCCG GGTGTGGTGG 5280
CGGGCACCTG TAGTCCCAGC TACTAGGGAG GCTGAGGCAG GAGAATGGCA TGAACCTGGG 5340
AGGCAGAGCT TGCAGTGAGC TGAGAATGCG CCACTGCACT CCAGCCTGGG GGACAGAGTG 5400
AGACTCTGCC TCAAAAAAAA AAAAAAAAAG AAAGAAAAAG GAGCGTTGCT TGTTTCAGGC 5460
CACAGGAAGG GGAGAGATAG TGAAAGTTTT TCAGAGAAGG TGGCCAGGGA AGGAGAAGAA 5520
AGGACTGTAG GCAGAGAGCA TAGCCTGTAC AAAGCCATAG AGGCAAGAGA AACCAGGAGC 5580
TGTAGAGAAG TTGGCAAGGC TGTTGAACAC TATGGTGAAC ACTATGGCGG CTTCCATGAA 5640
ATATCTGAGC TTTTGCTCCC CACTAGGGTG TTCCCTCAGG TCAATGTCAC GAAGATGGGC 5700
AGTTGGGGCC ACTTTAACTG TTCCTACTCC TGCTCCTTCC TTCTGGCTCC GGAAGACCCC 5760
ATATTCCCCA TCATCGGGAG CCTCTTCCTG CGAGAGCTGA TCAAAGAGTT TGGCACAGAC 5820
CACATCTATG GGGCCGACAC TTTCAATGAG ATGCAGCCAC CTTCCTCAGA GCCCTCCTAC 5880
CTTGCCGCAG CCACCACTGC CGTCTATGAG GCCATGACTG CAGGTACAGT GCCTGGGTGG 5940
GGTGGGAGAG CCCCCCAGAC CCTCAAAAAG AAGGGAGTAG CAGATGTCAG TAGGGGTAGG 6000
CAGAGGGACT GGAATAATGC CTCGCCATAA CACACAGTAC TTTATAGTTT ACCAAGCACG 6060
TGTACACATG CGTTGTCTCA GTGAATCCCA CTGTGGTTGA GAGGTGAGCT CTGGAAGCCA 6120
ACAACCTGGG TCACACCTCG CGCTCCTATT TCCTGGCCGT GTGACTTATG ACTCATGACC 6180
TCCTTCCCAG TGTCTCGTTT GCTTTTCCTG TAAACTGGGA CTACCTCATA GGTAGAATAA 6240

CGCCTGGCCC AGAGCAAAGG CCACTAAGAG CTAGCTATGA ACAAGGATTT TGTTTCATCT 6300

CTGCGTGGTT GCTGAAGTAG GCACTGCAGG CAGGAGGTGA GTGGATGTGC CTAAAGGCAC 6360

TAAGTGCGCA TCCTGCTACA AAACTGTGAA GCCAGGGCTC CTTCCTGCCA CTTAAAGGAG 6420

GAGTGGAGCA GAGGGCGCCC AAGTCAGGAA TGACTTAGTG GAGAGGCGTC TGTGTTGGCC 6480

AGGAAGGGAA CAGATCAGCT CAGCCTTTCT TGAGCAGTAC TGCTCCAAGT GTGACCCAAA 6540

ACCAGCAGCA GCAGCAGCAG CAGCCCGAGC TGTGAGATGG CAAATTCTCA GGCCCTACCC 6600

AAGACCTGAA GGAGAAGCTA CATTTTTTTT TTTTTTGAGA CAGATTTCAC TCTGTTGCTG 6660

AGGCTGGAGC ACAGTGGCAC AATCTCATCT CACTGCAACC TTCGTCTCCT AGGTTCAAGC 6720

GATTCTCCTG CCTCAGCCTC CCGAGTAGCT GGGACTATAG GCACCCGCCA CCACGCCCGG 6780

CAATTTTTGT TTGTTTTGAG ATAGAGTCTC GCTCTGTCAC CCAGGCTGGA GTGCAGTGGC 6840

ACGATCTCAG TTCACTGCAA CCTCTGCTTC CTGAGTTCAA GCGATTCTCC TGCCTCAGCC 6900

TCCTGAGTAG CTGGGATTAC AGGCGCCCCC CAACCACACT CGGCTAATTT TTGTATTTTT 6960

AGTAGAGACG GGGTTTCGCT ATGTAGGTCA AGCTGGTTTC AAACTCCTGA CCTCAAATGA 7020

TTCGCCCACT TCAGCCTCCC AAAGTGCTGG GATTACAGGT GTGAGCCACC TTGCCTGGCC 7080

AATTTTTGTA TTTTTAGTAG AAACAGGTTT CACCATGGTG GCCAGACTGG TCTCAAACTC 7140

CTGACCTCAG GTGAACTGCC CACCTCAGCC TCCCAAAGTA CTGGTATTAC AGGCGTGATC 7200

CACTGCGACT GGCCTTGATT TTGTTTTTGA GACAGAATCT TACTCTGTCG GCCAGACTGG 7260

AGTGCAGTGG CACAATCTCA GCTCACTGCA ACTTCTGCCT CATGGGTTCA AGTGATTCTT 7320

GTGCCTCTAC CTCCCGAGTA GCCGGGATTA CAGGCACCTG CCATTACGCT AGGCTAATTT 7380

TTGTATTTTT AGTATAGACA GGGTTTCCCC ACATTGGCCA GGCTGGTCTG GAACTCCTGG 7440

GCTCAAGTGA TCCACCTGCT TCAGCCCCTC AGAGTACTGG GATTATAGGT GTGGGCCACC 7500

ACGCCCATTC AGAAACCTCC ATGTTTTAAG GAGCCCTCTG GGTAACTCTC ATGTTCACCC 7560

AAGCTGCTGA ACCCTGTCCT GGAGTTTTCA GAGGGACGCG TATGTGCCAC AGAGCGTCCC 7620

GCTGGTGGGG GTCATGGGAA GCCATGACCT GGGATAGACA GTCGTCTGTA GAGTGGGGTG 7680

AACATTCCCT GGGCCCTCTG TTTCATCACT CCTCTTCTCT GTTCCCCCTA CCTCCTGTCC 7740

ACAGTGGATA CTGAGGCTGT GTGGCTGCTC CAAGGCTGGC TCTTCCAGCA CCAGCCGCAG 7800

TTCTGGGGGC CCGCCCAGAT CAGGGCTGTG CTGGGAGCTG TGCCCCGTGG CCGCCTCCTG 7860



GTTCTGGACC TGTTTGCTGA GAGCCAGCCT GTGTATACCC GCACTGCCTC CTTCCAGGGC 7920
CAGCCCTTCA TCTGGTGCAT GCTGCACAAC TTTGGGGGAA ACCATGGTCT TTTTGGAGCC 7980
CTAGAGGCTG TGAACGGAGG CCCAGAAGCT GCCCGCCTCT TCCCCAACTC CACCATGGTA 8040
GGCACGGGCA TGGCCCCCGA GGGCATCAGC CAGAACGAAG TGGTCTATTC CCTCATGGCT 8100
GAGCTGGGCT GGCGAAAGGA CCCAGTGCCA GATTTGGCAG CCTGGGTGAC CAGCTTTGCC 8160
GCCCGGCGGT ATGGGGTCTC CCACCCGGAC GCAGGGGCAG CGTGGAGGCT ACTGCTCCGG 8220
AGTGTGTACA ACTGCTCCGG GGAGGCCTGC AGGGGCCACA ATCGTAGCCC GCTGGTCAGG 8280
CGGCCGTCCC TACAGATGAA TACCAGCATC TGGTACAACC GATCTGATGT GTTTGAGGCC 8340
TGGCGGCTGC TGCTCACATC TGCTCCCTCC CTGGCCACCA GCCCCGCCTT CCGCTACGAC 8400
CTGCTGGACC TCACTCGGCA GGCAGTGCAG GAGCTGGTCA GCTTGTACTA TGAGGAGGCA 8460
AGAAGCGCCT ACCTGAGCAA GGAGCTGGCC TCCCTGTTGA GGGCTGGAGG CGTCCTGGCC 8520
TATGAGCTGC TGCCGGCACT GGACGAGGTG CTGGCTAGTG ACAGCCGCTT CTTGCTGGGC 8580
AGCTGGCTAG AGCAGGCCCG AGCAGCGGCA GTCAGTGAGG CCGAGGCCGA TTTCTACGAG 8640
CAGAACAGCC GCTACCAGCT GACCTTGTGG GGGCCAGAAG GCAACATCCT GGACTATGCC 8700
AACAAGCAGC TGGCGGGGTT GGTGGCCAAC TACTACACCC CTCGCTGGCG GCTTTTCCTG 8760
GAGGCGCTGG TTGACAGTGT GGCCCAGGGC ATCCCTTTCC AACAGCACCA GTTTGACAAA 8820
AATGTCTTCC AACTGGAGCA GGCCTTCGTT CTCAGCAAGC AGAGGTACCC CAGCCAGCCG 8880
CGAGGAGACA CTGTGGACCT GGCCAAGAAG ATCTTCCTCA AATATTACCC CGGCTGGGTG 8940
GCCGGCTCTT GGTGATAGAT TCGCCACCAC TGGGCCTTGT TTTCCGCTAA TTCCAGGGCA 9000
GATTCCAGGG CCCAGAGCTG GACAGACATC ACAGGATAAC CCAGGCCTGG GAGGAGGCCC 9060
CACGGCCTGC TGGTGGGGTC TGACCTGGGG GGATTGGAGG GAAATGACCT GCCCTCCACC 9120
ACCACCCAAA GTGTGGGATT AAAGTACTGT TTTCTTTCCA CTTAAACTGA TGAGTCCCCT 9180
GGGTCTGTCA AAATGAGAAG GTCACTGCTG CCACGCTTGG GAGGACTCAG GGCTATAGCA 9240
TGGCCCTGGG GTGGGACCTG TTCTCCCATC CCTTGCCTCA CGTCCCTGTT TTTGTTTGTT 9300
TGTTTGTTTG TGACGGAGCC TTGGTCTGTT GCCCAGGCTT GAGTACAATG GCACAGTCTC 9360
GGCTCACTGC AACCTCCGCC TCCTGGGTTC AAGCAATTCT TGTGCCTCAG CCTCCCCGGT 9420
AGCTGGGACT ATAGGCATGC ACCACCACAC CAGGCTAATT TTTTTTTTTC CAAGATGGAG 9480
TCTTGCTCTG TCGCCCAGGT TGGAGTTTAG TGGCACCATA TTGGTTTACT GCAACCTCTG 9540
CCTCCCGGGT TCAAGCAATT CTCCTGCCTC AGTCTACCAG GGAGTTAGGA CTACGGGCCT 9600
GTGCCATCAC GCCTGGCTAA TTTTTGTATT TTTCATAGAG ATAAGGTTTC ACCATGTTGG 9660
CCAGGCTGGT CTTTAACTCC TGAACTCAAG TGATCCACCT GCCTCGGCCT TCCAAAGTGC 9720
TGGGATTACA GGAGTGAGCC ACCGTGCCCG GCCATGTCTC TCTTTTTAAC ACTAATGTTA 9780
CCCTGACCTT TGAACGTAGA ATGCCCTTCT GTTGCAGGAA AACCTCTTTT CAAACCATGT 9840
TTGTCCTTTG CTGGCATGCC ACAGCAACAG TCACCAACAC AGAAGACTTC TGTGACCAAA 9900
TATTTGGAGG ATTTTCCCCA CACACACCAA GCAGCAGACA TCAGCTGGGT GTCCTCCAAT 9960
TCAGTTCCAA TGTAATCAAC CAGAGACAGC ATCAGATCCC ACAGGGTTAG GGTGCAGATC 10020
CATGAGACCA CCCCCTCCTT CCCAACGGTT ACAAGTCCTG ATCCCTGGAA CTTCTGACTA 10080
ACTGGCTTCA AGTTGGAGTT CCCATGACCC CCTTCCCCTC TTTGGAGTCA ACTCATTTGC 10140
GACAGTGACC CACGAAACAC AGGGAAACCC TTATTATGTT TATTGCTTTA TTACAGAGGA 10200
AAAAAATTTT TTTCTTTCTT TTTTGAGACA GGGTCTCACT CTGTCATCCA GAATGACTGC 10260
AGTGGCAGGA TCTGGCTCCG TCACCCAGGC TGGAGTGCAG TGGCATGATC TCGGCTCACT 10320
ACAGCCTCCA TCCCCCCCAA ACCCCACGCC TCAGCGCCCC ACCCCGCAAG TGGCTGGGAC 10380

(2) INFORMATION FOR SEQ ID NO: 4:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 20 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(v) FRAGMENT TYPE: N-terminal

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: Modified-site

(B) LOCATION: 10



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 4:

Asp Glu Ala Arg Glu Ala Ala Ala Val Arg Ala Leu Val Ala Arg
1 5 10 15

Leu Leu Gly Pro Gly
20



(2) INFORMATION FOR SEQ ID NO: 5:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 18 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(v) FRAGMENT TYPE: N-terminal

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: Modified-site, glycosylated or
phosphorylated, wherein Xaa may be any
amino acid residue, preferably Arg

(B) LOCATION: 16



(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 5:

Lys Pro Gly Leu Asp Thr Tyr Ser Leu Gly Gly Gly Gly Ala Ala Xaa Val
1 5 10 15

Arg



(2) INFORMATION FOR SEQ ID NO: 6:

(i) SEQUENCE CHARACTERISTICS:

(A) LENGTH: 15 amino acids

(B) TYPE: amino acid
(D) TOPOLOGY: linear

(ii) MOLECULE TYPE: peptide

(iii) HYPOTHETICAL: NO

(v) FRAGMENT TYPE: internal

(vi) ORIGINAL SOURCE:

(A) ORGANISM: Homo sapiens

(ix) FEATURE:

(A) NAME/KEY: Modified-site, glycosylated or
phosphorylated, wherein Xaa may be any
amino acid residue, preferably Ala

(B) LOCATION: 12

(ix) FEATURE:

(A) NAME/KEY: Modified-site, glycosylated or
phosphorylated, wherein Xaa may be any
amino acid residue, preferably Ser

(B) LOCATION: 14

(xi) SEQUENCE DESCRIPTION: SEQ ID NO: 6:

Trp Arg Leu Leu Leu Thr Ser Ala Pro Ser Leu Xaa Thr Xaa Pro
1 5 10 15



